We present first evidence for WW+WZ production in lepton+jets final states at a hadron collider. The data correspond to 1.07 fb$^{-1}$ of integrated luminosity collected with the D0 detector at the Fermilab Tevatron in $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV. The observed cross section for WW+WZ production is 20.2 ± 4.5 pb, consistent with the standard model and more precise than previous measurements in fully leptonic final states. The probability that background fluctuations alone produce this excess is < 5.4 × 10$^{-6}$, which corresponds to a significance of 4.4 standard deviations.

PACS numbers: 14.70.Fm, 14.70.Hp, 13.85.Ni, 13.85.Qk



The production of vector-boson pairs in $p\bar{p}$ collisions (WW, WZ, or ZZ) provides important tests of the electroweak sector of the standard model (SM). The next-to-leading-order (NLO) cross sections for WW and WZ production in $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV predicted by the SM are $\sigma(WW) = 12.4 \pm 0.8$ pb and $\sigma(WZ) = 3.7 \pm 0.3$ pb [1]. A discrepancy with this expectation or deviations in the predicted kinematic distributions could signal the presence of new physics, e.g., originating from anomalous trilinear gauge boson couplings [2]. The production of two weak bosons is also relevant to searches for the Higgs boson or for new particles in extensions of the SM. Production of WW and WZ in $p\bar{p}$ collisions at the Fermilab Tevatron Collider has thus far been observed only in fully leptonic decay modes [3, 4]. Previous searches for WW and WZ in lepton+jets final states [5, 6], which benefit from a higher branching ratio relative to fully leptonic channels, were hindered by large backgrounds from jets produced in association with a W boson (W+jets).

In this Letter we report first evidence from a hadron collider for the production of a W boson that decays leptonically, associated with a second vector boson V (V = W or Z) that decays into $q\bar{q}$ ($WV \to \ell\nu q\bar{q}$; $\ell = e, \mu$). The limited dijet mass resolution (18% for dijets from W/Z decays) results in a significant overlap of the $W \to q\bar{q}$ and $Z \to q\bar{q}$ dijet mass peaks. We therefore consider WW and WZ simultaneously, assuming the ratio of their cross sections as predicted by the SM. The use of improved multivariate event classification and new statistical techniques [7], as well as an increased integrated luminosity, makes the WV signal in lepton+jets final states more distinguishable from the W+jets background and more accessible to measurement than in the past [5, 6]. This analysis also provides a valuable proving ground for such advanced techniques, now ubiquitous in Higgs searches at the Tevatron.

We analyze 1.07 fb$^{-1}$ of data collected with the D0 detector [8] at a center-of-mass energy of 1.96 TeV at the Tevatron. Candidate $e\nu q\bar{q}$ events must pass a trigger based on a single-electron or electron+jet(s) requirement that has an efficiency of 98%. A suite of triggers for $\mu\nu q\bar{q}$ candidate events achieves an efficiency of > 95% at 95% confidence level.

To select $WV \to \ell\nu q\bar{q}$ candidates, we require: a single reconstructed lepton (electron or muon) [9] with transverse momentum $p_T > 20$ GeV and pseudorapidity $|\eta| < 1.1$ (2.0) for electrons (muons); the imbalance in transverse energy to be $\not\!\!E_T > 20$ GeV; and at least two jets [10] with $p_T > 20$ GeV and $|\eta| < 2.5$. The jet of highest $p_T$ must have $p_T > 30$ GeV. To reduce background from processes that do not contain $W \to \ell\nu$, we require a "transverse" mass [11] of $M_T^{\ell\nu} > 35$ GeV. The lepton must be spatially matched to a track reconstructed in the central tracker that originates from the primary vertex. Electrons (muons) must be isolated from other particles in the calorimeter (and central tracker).
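The kinematic part of this selection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Lepton` and `Jet` containers and the function names are hypothetical, "a single reconstructed lepton" is read as exactly one lepton, and the track-matching and isolation requirements are not modeled.

```python
import math
from dataclasses import dataclass


@dataclass
class Lepton:          # hypothetical container, not a D0 data format
    flavor: str        # "e" or "mu"
    pt: float          # transverse momentum in GeV
    eta: float         # pseudorapidity
    phi: float         # azimuthal angle in radians


@dataclass
class Jet:             # hypothetical container
    pt: float          # GeV
    eta: float


def transverse_mass(lep_pt, met, dphi):
    """Transverse mass from lepton pT, missing ET and their azimuthal separation."""
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(dphi)))


def passes_selection(leptons, jets, met, met_phi):
    """Apply the kinematic cuts quoted in the text (track match and isolation omitted)."""
    if len(leptons) != 1:                      # assumption: exactly one lepton
        return False
    lep = leptons[0]
    max_eta = 1.1 if lep.flavor == "e" else 2.0
    if lep.pt <= 20.0 or abs(lep.eta) >= max_eta:
        return False
    if met <= 20.0:
        return False
    good = sorted((j for j in jets if j.pt > 20.0 and abs(j.eta) < 2.5),
                  key=lambda j: j.pt, reverse=True)
    if len(good) < 2 or good[0].pt <= 30.0:    # leading jet must exceed 30 GeV
        return False
    return transverse_mass(lep.pt, met, lep.phi - met_phi) > 35.0
```

For example, `passes_selection([Lepton("e", 40.0, 0.5, 0.0)], [Jet(50.0, 0.3), Jet(25.0, -1.0)], 40.0, math.pi)` evaluates the cuts for an electron event with the missing transverse energy back-to-back with the lepton.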

Signal and background processes containing charged leptons are modeled via Monte Carlo (MC) simulation. The signal includes all possible W and Z decays, including their decays to leptons. The diboson signal (WW and WZ) is generated with PYTHIA [13] using CTEQ6L parton distribution functions (PDFs). The fixed-order matrix element (FOME) generator ALPGEN [14] with CTEQ6L1 PDFs is used to generate W+jets, Z+jets, and $t\bar{t}$ events to leading order at the parton level. The FOME generator COMPHEP [15] is used to produce single top-quark MC samples. ALPGEN and COMPHEP are interfaced to PYTHIA for subsequent parton showering and hadronization. All simulated events undergo a GEANT-based [16] detector simulation and are reconstructed using the same programs as used for D0 data. The MC samples are normalized using next-to-leading-order (NLO) or next-to-next-to-leading-order predictions for SM cross sections, except W+jets, which is scaled to the data.

The probability for multijet events with misidentified leptons to pass all selection requirements is small; however, because of the copious production of multijet events, the background from this source cannot be ignored. For $\mu\nu q\bar{q}$, the multijet background is modeled with data that fail the muon isolation requirements, but pass all other selections. The normalization is determined from a fit to the $M_T^{\ell\nu}$ distribution. For $e\nu q\bar{q}$, the multijet background is estimated using a "loose-but-not-tight" data sample obtained by selecting events that pass loosened electron quality requirements, but fail the tight electron quality criteria [9]. This sample is normalized by the probability for a jet that passes the "loose" electron requirements to also pass the tight requirement. Both $\mu\nu q\bar{q}$ and $e\nu q\bar{q}$ multijet samples are corrected for contributions from all processes modeled through MC.

Accurate modeling of the selected events is vital. The dominant background is W+jets, and the modeling of ALPGEN W+jets and sources of uncertainty are therefore studied in great detail. Comparison of ALPGEN with other generators and with data shows discrepancies [17] in jet $\eta$ and dijet angular separation. Data are used to correct these quantities in the ALPGEN W+jets and Z+jets samples. The possible bias in this procedure from the presence of the diboson signal in data is small, but is nevertheless taken into account via a systematic uncertainty. Systematic effects on the differential distributions of the ALPGEN W+jets and Z+jets MC events from changes of the renormalization and factorization scales and of the parameters used in the MLM parton-jet matching algorithm [18] are also considered. Uncertainties on PDFs, as well as uncertainties from object reconstruction and identification, are evaluated for all MC samples. We consider the effect of systematic uncertainty both on the normalization and on the shape of differential distributions for signal and backgrounds [19].

The signal and the backgrounds are further separated using a multivariate classifier to combine information from several kinematic variables. This analysis uses a Random Forest (RF) classifier [20, 21]. Thirteen well-modeled kinematic variables [19] that demonstrate a difference in probability density between signal and at least one of the backgrounds, such as dijet mass and $\not\!\!E_T$, are used as inputs to the RF. The RF is trained using half of each MC sample. The other halves, along with the multijet background samples, are then evaluated by the RF and used in the measurement.

FIG. 1: (a) The RF output distribution from the combined $e\nu q\bar{q}$ and $\mu\nu q\bar{q}$ channels for data and MC predictions following the fit of MC to data. (b) A comparison of the extracted signal (filled histogram) to background-subtracted data (points), along with the ±1 standard deviation (s.d.) systematic uncertainty on the background. The residual distance between the data points and the extracted signal, divided by the total uncertainty, is given at the bottom.

FIG. 2: (a) The dijet mass distribution from the combined $e\nu q\bar{q}$ and $\mu\nu q\bar{q}$ channels for data and MC predictions following the fit to the RF output. (b) A comparison of the extracted signal (filled histogram) to background-subtracted data (points), along with the ±1 standard deviation (s.d.) systematic uncertainty on the background. The residual distance between the data points and the extracted signal, divided by the total uncertainty, is given at the bottom.

TABLE I: Measured number of events for signal and each background after the combined fit (with total uncertainties determined from the fit) and the number observed in data.

                           $e\nu q\bar{q}$ channel    $\mu\nu q\bar{q}$ channel
Diboson signal             436 ± 36                   527 ± 43
W+jets                     10100 ± 500                11910 ± 590
Z+jets                     387 ± 61                   1180 ± 180
$t\bar{t}$ + single top    436 ± 57                   426 ± 54
Multijet                   1100 ± 200                 328 ± 83
Total predicted            12460 ± 550                14370 ± 620
Data                       12473                      14392
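As a quick arithmetic cross-check of Table I, the fitted yields can be summed per channel and compared with the quoted totals and the observed event counts. The snippet below is a minimal sketch; the dictionary layout and names are illustrative only, and the numbers are copied from the table.

```python
# Fitted yields from Table I (central values only), per lepton channel.
yields = {
    "e":  {"diboson": 436, "W+jets": 10100, "Z+jets": 387,
           "top": 436, "multijet": 1100},
    "mu": {"diboson": 527, "W+jets": 11910, "Z+jets": 1180,
           "top": 426, "multijet": 328},
}
observed = {"e": 12473, "mu": 14392}

# Sum of all fitted components in each channel.
totals = {ch: sum(parts.values()) for ch, parts in yields.items()}

# Fraction of the fitted total attributed to the diboson signal.
signal_fraction = {ch: yields[ch]["diboson"] / totals[ch] for ch in yields}

for ch in yields:
    print(ch, totals[ch], observed[ch], round(signal_fraction[ch], 3))
```

The sums agree with the rounded "Total predicted" row, and the signal makes up only a few percent of the selected sample, which is why the multivariate separation matters.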

The signal cross section is determined from a fit of signal and background RF templates to the data by minimizing a Poisson $\chi^2$ function with respect to variations in the systematic uncertainties [7]. The magnitude of systematic uncertainties is effectively constrained by the regions of the RF distribution with low signal over background. A Gaussian prior is used for each systematic uncertainty. Different uncertainties are assumed to be mutually independent, but those common to multiple samples or lepton channels are assumed to be 100% correlated.
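To make the structure of such a fit concrete, the following sketch evaluates one common form of a binned Poisson $\chi^2$ with Gaussian-constrained nuisance parameters. It is an assumption-laden toy, not the analysis code: the exact functional form used in Ref. [7] is not given in the text, the nuisance parameters here act only on a single background template, and all names are hypothetical.

```python
import math


def poisson_chi2(data, signal, background, sig_scale, nuisances, frac_uncerts):
    """Binned Poisson chi-square with Gaussian priors on nuisance parameters.

    data, signal, background: per-bin event counts (lists of equal length).
    sig_scale: multiplicative factor applied to the signal template.
    nuisances: nuisance parameter values in units of standard deviations.
    frac_uncerts: for each nuisance, the per-bin fractional shift of the
                  background for a one-standard-deviation variation.
    """
    chi2 = 0.0
    for i, d in enumerate(data):
        # Background shifted coherently by every nuisance parameter.
        shift = 1.0 + sum(r * f[i] for r, f in zip(nuisances, frac_uncerts))
        pred = sig_scale * signal[i] + background[i] * shift
        chi2 += 2.0 * (pred - d)
        if d > 0:
            chi2 += 2.0 * d * math.log(d / pred)
    # Gaussian prior: one unit of chi-square per standard deviation squared.
    chi2 += sum(r * r for r in nuisances)
    return chi2
```

A fit would minimize this function over `sig_scale` and `nuisances`; bins with little signal dominate the constraint on the nuisance parameters, which is the effect described above.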

The fit simultaneously varies the WV and W+jets contributions, thereby also determining the normalization factor for the W+jets MC sample. This obviates the need for using the predicted ALPGEN cross section, and provides a more rigorous approach that incorporates an unbiased uncertainty from W+jets when extracting the WV cross section. The normalization factor from the fit for the W+jets component is 1.53 ± 0.13, similar to the expected ratio of NLO to LO cross sections [22]. The measured yields for signal and each background are given in Table I. Table II contains the measured WV cross section for each channel, separately and combined, showing consistent results between channels and with the SM prediction of $\sigma(WV) = 16.1 \pm 0.9$ pb [1]. The combined fit yields a cross section of 20.2 ± 2.5(stat) ± 3.6(sys) ± 1.2(lum) pb. The RF output distributions following the combined fit are shown in Fig. 1, along with comparisons of consistency between the background-subtracted data and the extracted signal. Figure 2 shows analogous plots for the dijet mass after the combined fit to the RF output. The dominant systematic uncertainties arise from the modeling of the W+jets background and the jet energy scale, contributing 2.4 pb and 1.9 pb to the total systematic uncertainty [19], respectively. The positions of the dijet mass peaks in data and MC are consistent within one half standard deviation, which includes the relative data/MC uncertainty in energy scale. As a cross check, we also perform the measurement using only the dijet mass distribution. The result, also given in Table II, although less precise, is consistent with that obtained using the RF output.

TABLE II: The signal cross section extracted from a simultaneous fit of the WV cross section and the normalization factor for W+jets. Also given are expected and observed p-values obtained by comparing the measurement with pseudo-experiments assuming no signal and the corresponding significance in number of standard deviations (s.d.) for a one-sided Gaussian integral.

Channel                        Fitted signal $\sigma$ (pb)                  Expected p-value (significance)   Observed p-value (significance)
$e\nu q\bar{q}$ RF Output      18.0 ± 3.7(stat) ± 5.2(sys) ± 1.1(lum)       6.8 × 10$^{-3}$ (2.5 s.d.)        3.2 × 10$^{-3}$ (2.7 s.d.)
$\mu\nu q\bar{q}$ RF Output    22.8 ± 3.3(stat) ± 4.9(sys) ± 1.4(lum)       1.8 × 10$^{-3}$ (2.9 s.d.)        5.2 × 10$^{-5}$ (3.9 s.d.)
Combined RF Output             20.2 ± 2.5(stat) ± 3.6(sys) ± 1.2(lum)       1.5 × 10$^{-4}$ (3.6 s.d.)        5.4 × 10$^{-6}$ (4.4 s.d.)
Combined Dijet Mass            18.5 ± 2.8(stat) ± 4.9(sys) ± 1.1(lum)       1.7 × 10$^{-3}$ (2.9 s.d.)        4.4 × 10$^{-4}$ (3.3 s.d.)
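The total uncertainty of 4.5 pb quoted for the combined measurement follows from adding the statistical, systematic, and luminosity components in quadrature. The sketch below reproduces that number and also compares the result with the SM expectation; treating the SM uncertainty as a quadrature sum of the WW and WZ uncertainties, and forming a simple Gaussian pull, are assumptions made here for illustration.

```python
import math


def quadrature(*terms):
    """Combine independent uncertainties in quadrature."""
    return math.sqrt(sum(t * t for t in terms))


# Combined RF result: 20.2 +- 2.5 (stat) +- 3.6 (sys) +- 1.2 (lum) pb
measured = 20.2
total_unc = quadrature(2.5, 3.6, 1.2)      # about 4.5 pb

# SM prediction: sigma(WW) + sigma(WZ)
sm_pred = 12.4 + 3.7                        # 16.1 pb
sm_unc = quadrature(0.8, 0.3)               # about 0.9 pb (assumed uncorrelated)

# Difference between measurement and prediction in units of the combined uncertainty.
pull = (measured - sm_pred) / quadrature(total_unc, sm_unc)

print(round(total_unc, 1), round(sm_pred, 1), round(sm_unc, 1), round(pull, 2))
```

Under these assumptions the measurement lies within one standard deviation of the SM prediction.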

The significance of the measurement is obtained via fits of the signal+background hypothesis to pseudo-data samples drawn from the background-only hypothesis [23]. The observed (or expected) significance corresponds to the fraction of outcomes that yield a WV cross section at least as large as that measured in data (as predicted by the SM). The probabilities that background fluctuations could produce the expected and observed signal in each channel (p-values), separately and combined, are shown in Table II along with their corresponding significance (equivalent one-sided Gaussian probabilities). The $\chi^2$ fit with respect to variations in the systematic uncertainties [7] results in an improvement of the expected significance of the result from 2.4 (1.6) to 3.6 (2.9) standard deviations when using the RF output (dijet mass) discriminant.
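The conversion between a p-value and a one-sided Gaussian significance can be reproduced with the Python standard library. This is a minimal sketch of the conversion only (the function names are illustrative); it does not reproduce the pseudo-experiment procedure that yields the p-values themselves.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def significance(p_value):
    """Number of standard deviations whose one-sided upper tail equals p_value."""
    return -_STD_NORMAL.inv_cdf(p_value)


def one_sided_p(z):
    """One-sided upper-tail probability for a z-sigma Gaussian fluctuation."""
    return _STD_NORMAL.cdf(-z)


# Observed and expected p-values quoted in Table II, with their significances.
table_ii = [(3.2e-3, 2.7), (5.2e-5, 3.9), (5.4e-6, 4.4), (4.4e-4, 3.3),
            (6.8e-3, 2.5), (1.8e-3, 2.9), (1.5e-4, 3.6), (1.7e-3, 2.9)]
for p, quoted in table_ii:
    print(f"p = {p:.1e} -> {significance(p):.2f} s.d. (quoted {quoted})")
```

Each quoted significance agrees with the converted p-value to the stated one-decimal precision.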

In summary, we measure $\sigma(WV) = 20.2 \pm 4.5$ pb (with V = W or Z) in $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV. The probability that the backgrounds fluctuate to give an excess as large as observed in data is < 5.4 × 10$^{-6}$, corresponding to a significance of 4.4 standard deviations. This represents the first evidence for WV production in lepton+jets events at a hadron collider. The result is more precise than previous independent measurements of WW and WZ yields in fully leptonic final states [3, 4] and consistent with the SM prediction of $\sigma(WV) = 16.1 \pm 0.9$ pb [1]. This work clearly demonstrates the ability of the D0 experiment to isolate a small signal in a large background in a final state of direct relevance to searches for a low mass Higgs, and thereby validates the analytical methods used in searches for Higgs bosons at the Tevatron [24].
Supplemental Material:



SYSTEMATIC UNCERTAINTIES 



Table III gives the percentage systematic uncertainties for Monte Carlo simulations and multijet estimates. We consider the effect of systematic uncertainty both on the normalization and on the shape of differential distributions for signal and backgrounds. Although Table III lists an uncertainty for the W+jets simulation, this uncertainty is not used when measuring the diboson signal cross section, for which the W+jets normalization is a free parameter. However, the size of the uncertainty must be specified for generating the pseudo-data used in the estimation of significance. Also in the table is the contribution of each systematic uncertainty to the total systematic uncertainty of 3.6 pb on the measured cross section, $\sigma^{\mathrm{meas}}(WV)$. This total systematic uncertainty is obtained from the systematic uncertainties on the parameter in the fit to the Random Forest (RF) output, $\sigma^{\mathrm{meas}}(WV)/\sigma^{\mathrm{th}}(WV)$, by multiplying each contribution by the theoretical cross section $\sigma^{\mathrm{th}}(WV)$. The additional uncertainty on the integrated luminosity for data (6.1%) is therefore considered separately.
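The scaling described here can be illustrated numerically. In the sketch below, the fitted parameter is taken to be the ratio of measured to theoretical cross section, and the luminosity uncertainty is assumed to be 6.1% of each measured cross section; that second relation is an assumption made here, although it reproduces the "(lum)" values quoted with the results. Function names are hypothetical.

```python
SIGMA_TH = 16.1      # theoretical WV cross section in pb
LUMI_FRAC = 0.061    # fractional uncertainty on the integrated luminosity


def ratio_to_pb(delta_ratio, sigma_th=SIGMA_TH):
    """Convert an uncertainty on sigma_meas/sigma_th into an uncertainty in pb."""
    return delta_ratio * sigma_th


def lumi_uncertainty(sigma_meas, frac=LUMI_FRAC):
    """Luminosity uncertainty in pb, assumed proportional to the measured value."""
    return frac * sigma_meas


# Fitted signal-strength parameter for the combined RF result.
fitted_ratio = 20.2 / SIGMA_TH

# Luminosity uncertainties for the four quoted cross sections (pb).
lumi_check = {s: round(lumi_uncertainty(s), 1) for s in (18.0, 22.8, 20.2, 18.5)}
print(round(fitted_ratio, 3), lumi_check)
```

A total systematic uncertainty of 3.6 pb corresponds, under the same scaling, to roughly 0.22 in units of the fitted ratio.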



INPUT VARIABLES TO THE RANDOM FOREST 
CLASSIFIER 

The 13 kinematic variables used in the RF classifier are listed below, and their distributions are shown in Fig. 3. The variables are derived from characteristics of objects reconstructed from observables in each event and can be loosely classified into three categories: (i) variables based on the kinematics of individual objects, (ii) variables based on the kinematics of multiple objects, and (iii) variables based on the angular relationships among objects. Several variables are calculated using the four-momentum of the dijet system or the leptonic W candidate ($W^{\ell\nu}$). The dijet system is defined as the four-momentum sum of the jets with highest $p_T$ (Jet$_1$) and second highest $p_T$ (Jet$_2$). $W^{\ell\nu}$ is reconstructed from the charged lepton and the $\not\!\!E_T$. The neutrino from the $W \to \ell\nu$ decay is assigned the transverse momentum defined by $\not\!\!E_T$ and a longitudinal momentum that is calculated assuming the mass of the W for $\ell\nu$ ($M_W = 80.4$ GeV). Of the two possible solutions, we choose the one that provides the smaller total invariant mass of all objects in the event.
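The W-mass constraint leads to a quadratic equation for the neutrino longitudinal momentum. The sketch below solves it for a massless lepton and neutrino and then picks the solution giving the smaller total invariant mass, as described above. The function names and tuple conventions are hypothetical, and the handling of a negative discriminant (dropping the imaginary part) is an assumption, since the text does not say what is done in that case.

```python
import math

M_W = 80.4  # GeV, W mass used in the constraint


def four_vector(px, py, pz, mass=0.0):
    """Return (E, px, py, pz) for a particle of the given mass."""
    return (math.sqrt(px * px + py * py + pz * pz + mass * mass), px, py, pz)


def invariant_mass(vectors):
    """Invariant mass of the sum of (E, px, py, pz) four-vectors."""
    e, px, py, pz = (sum(v[i] for v in vectors) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def neutrino_pz_solutions(lep, met_x, met_y, m_w=M_W):
    """Both pz solutions of M(lepton, neutrino) = m_w; lep = (px, py, pz)."""
    lx, ly, lz = lep
    e_l = math.sqrt(lx * lx + ly * ly + lz * lz)
    pt_l2 = lx * lx + ly * ly
    pt_nu2 = met_x * met_x + met_y * met_y
    a = 0.5 * m_w * m_w + lx * met_x + ly * met_y
    disc = a * a - pt_l2 * pt_nu2
    # Assumption: if the discriminant is negative, keep only the real part.
    root = e_l * math.sqrt(disc) if disc > 0.0 else 0.0
    return ((a * lz + root) / pt_l2, (a * lz - root) / pt_l2)


def choose_neutrino(lep, met_x, met_y, jets):
    """Pick the neutrino solution with the smaller total invariant mass.

    jets: list of (E, px, py, pz) four-vectors of all jets in the event.
    """
    lep4 = four_vector(*lep)
    candidates = [four_vector(met_x, met_y, pz)
                  for pz in neutrino_pz_solutions(lep, met_x, met_y)]
    return min(candidates, key=lambda nu: invariant_mass([lep4, nu] + jets))
```

When the discriminant is positive, both solutions reproduce the assumed W mass exactly, so the choice between them has to come from an additional criterion such as the total invariant mass.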

• Kinematics of Individual Objects:

1. The imbalance in transverse energy ($\not\!\!E_T$), which is defined by the imbalance in transverse momentum as determined from the summing of products of energies and cosines of polar angles of calorimeter cells relative to the center of the detector (corrected for transverse momenta of muons and energy scales for jets and electrons in the event).

2. The second-highest jet $p_T$: $p_T(\mathrm{Jet}_2)$.

• Kinematics of Multiple Objects:

1. The "transverse W mass" reconstructed from the charged lepton and the $\not\!\!E_T$: $M_T^{\ell\nu} = \sqrt{2\, p_T^{\ell}\, \not\!\!E_T\, (1 - \cos(\Delta\phi(\ell, \not\!\!E_T)))}$.

2. The $p_T$ of the $W^{\ell\nu}$ candidate.

3. The invariant mass of the dijet system.

4. The magnitude of the leading jet momentum perpendicular to the plane of the dijet system: $\frac{|\vec{p}_T(\mathrm{Jet}_1+\mathrm{Jet}_2) \times \vec{p}_T(\mathrm{Jet}_1)|}{|\vec{p}_T(\mathrm{Jet}_1+\mathrm{Jet}_2)|}$, where "×" represents the usual vector cross product. This variable is calculated in the rest frame of the $W^{\ell\nu}$ candidate and is denoted $p_T^{\mathrm{rel}}(\mathrm{Dijet}, \mathrm{Jet}_1)^{W\mathrm{Frame}}$.

5. The magnitude of the second-leading jet momentum perpendicular to the plane of the dijet system: $\frac{|\vec{p}_T(\mathrm{Jet}_1+\mathrm{Jet}_2) \times \vec{p}_T(\mathrm{Jet}_2)|}{|\vec{p}_T(\mathrm{Jet}_1+\mathrm{Jet}_2)|}$. This variable is calculated in the laboratory frame and is denoted $p_T^{\mathrm{rel}}(\mathrm{Dijet}, \mathrm{Jet}_2)^{\mathrm{LabFrame}}$.

6. The angular separation between the two jets of highest $p_T$, weighted by the ratio of the transverse momentum of the second-leading jet and the $W^{\ell\nu}$ candidate: $\Delta R(\mathrm{Jet}_1, \mathrm{Jet}_2)\, \frac{p_T(\mathrm{Jet}_2)}{p_T(W^{\ell\nu})}$. This variable is calculated in the rest frame of the $W^{\ell\nu}$ candidate and is denoted $k_T^{\mathrm{Min,WFrame}}$.

7. The "centrality" of the charged lepton and jets system, defined as the scalar sum of transverse momenta divided by the sum of energies of the charged lepton and all jets in the event.

• Angular Relationships of Objects:

1. The azimuthal separation between the charged lepton and the $\not\!\!E_T$ vector: $\Delta\phi(\not\!\!E_T, \mathrm{lepton})$.

2. The cosine of the angle between the dijet system and the leading jet in the laboratory frame: $\cos(\angle(\mathrm{Dijet}, \mathrm{Jet}_1))$.

3. The cosine of the angle between the dijet system and the second-leading jet in the laboratory frame: $\cos(\angle(\mathrm{Dijet}, \mathrm{Jet}_2))$.

4. The cosine of the angle between the leading jet and the $W^{\ell\nu}$ candidate: $\cos(\angle(W^{\ell\nu}, \mathrm{Jet}_1))^{\mathrm{DijetFrame}}$, evaluated in the rest frame of the dijet system.
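Several of the listed quantities reduce to short formulas once the object momenta are known. The helpers below are a sketch only: the function names are hypothetical, $\Delta R$ is taken to be the conventional $\sqrt{\Delta\eta^2 + \Delta\phi^2}$ (the text does not spell out its definition), and the boosts into the $W^{\ell\nu}$ or dijet rest frames are not included.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into the range [0, pi]."""
    d = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - d if d > math.pi else d


def delta_r(eta1, phi1, eta2, phi2):
    """Conventional angular separation sqrt(d_eta^2 + d_phi^2) (assumed definition)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def pt_rel(jet_xy, dijet_xy):
    """|p_T(dijet) x p_T(jet)| / |p_T(dijet)| for transverse (x, y) vectors."""
    cross = dijet_xy[0] * jet_xy[1] - dijet_xy[1] * jet_xy[0]
    return abs(cross) / math.hypot(dijet_xy[0], dijet_xy[1])


def centrality(objects):
    """Scalar sum of pT over sum of energies; objects is a list of (pt, energy)."""
    return sum(pt for pt, _ in objects) / sum(e for _, e in objects)
```

For instance, `pt_rel((3.0, 4.0), (1.0, 0.0))` returns the component of the jet transverse momentum perpendicular to the dijet direction.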
TABLE III: The percentage systematic uncertainties for Monte Carlo simulations and multijet estimates. Uncertainties are identical for both lepton channels except where otherwise indicated. The nature of the uncertainty, i.e., whether it refers to a differential dependence (D) or just normalization (N), is also provided. The values for uncertainties with a differential dependence correspond to the maximum amplitude of fluctuations in the RF output. Also provided is the contribution of each source to the total systematic uncertainty of 3.6 pb on the measured cross section, which does not include the additional uncertainty of 6.1% for the luminosity.

Source of systematic uncertainty                   Diboson signal   W+jets        Z+jets        Top           Multijet   Nature   $\Delta\sigma$ (pb)
Trigger efficiency, $e\nu q\bar{q}$ channel        [illegible]      [illegible]   [illegible]   [illegible]   -          N        < 0.1
Trigger efficiency, $\mu\nu q\bar{q}$ channel      [illegible]      [illegible]   [illegible]   -             -          D        < 0.1
Lepton identification                              ±4               ±4            ±4            ±4            -          N        < 0.1
Jet identification                                 ±1               ±1            ±1            ± <1          -          D        0.3
Jet energy scale                                   ±4               ±9            ±9            ±4            -          D        1.9
Jet energy resolution                              ±3               ±4            ±4            ±4            -          N        < 0.1
Cross section                                      -                ±20^a         ±6            ±10           -          N        1.1
Multijet normalization, $e\nu q\bar{q}$ channel    -                -             -             -             ±20        N        0.9
Multijet normalization, $\mu\nu q\bar{q}$ channel  -                -             -             -             ±30        N        0.5
Multijet shape, $e\nu q\bar{q}$ channel            -                -             -             -             ±6         D        < 0.1
Multijet shape, $\mu\nu q\bar{q}$ channel          -                -             -             -             ±10        D        < 0.1
Diboson signal NLO/LO shape                        ±10              -             -             -             -          D        < 0.1
Parton distribution function                       ±1               ±1            ±1            ±1            -          D        0.2
ALPGEN $\eta$ and $\Delta R$ corrections           -                ±1            ±1            -             -          D        < 0.1
Renormalization and factorization scale            -                ±3            ±3            -             -          D        0.9
ALPGEN parton-jet matching parameters              -                ±4            ±4            -             -          D        2.4

^a The uncertainty on the cross section for W+jets is not used in the diboson signal cross section measurement (the W+jets normalization is a free parameter); however, it is needed for generating pseudo-data to estimate the significance of the observed signal.
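As a rough consistency check on the last column of Table III, the listed contributions can be added in quadrature. This is a naive sketch: it ignores any correlations introduced by the fit and leaves out the entries quoted only as "< 0.1", so it is not expected to reproduce the total of 3.6 pb exactly.

```python
import math

# Contributions to the cross-section uncertainty in pb, read from Table III.
# Entries quoted only as "< 0.1" are omitted.
contributions = {
    "jet identification": 0.3,
    "jet energy scale": 1.9,
    "cross section": 1.1,
    "multijet normalization (e)": 0.9,
    "multijet normalization (mu)": 0.5,
    "parton distribution function": 0.2,
    "renormalization and factorization scale": 0.9,
    "parton-jet matching parameters": 2.4,
}

# Naive quadrature sum, assuming the sources are independent.
naive_total = math.sqrt(sum(v * v for v in contributions.values()))
print(round(naive_total, 2))
```

The naive sum comes out close to, and slightly below, the quoted 3.6 pb, with the two largest entries dominating.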



CORRELATION BETWEEN DIJET INVARIANT
MASS AND RF OUTPUT

There is a high degree of correlation between the dijet invariant mass and the RF output. This can be observed in the dijet invariant mass distributions for events with low, intermediate, and high values of the RF output shown in Fig. 4. As expected, events in the low region of the RF output correspond to the background-dominated sidebands of the dijet invariant mass distribution, and events in the high region of the RF output correspond to the signal resonance region of the dijet invariant mass. The purity of the signal in the dijet invariant mass distribution is enhanced for high values of the RF output because a substantial fraction of the background events in the dijet invariant mass signal region has been moved to the intermediate region of the RF output.
FIG. 3: Distributions of the RF input variables for the combined $e\nu q\bar{q}$ and $\mu\nu q\bar{q}$ channels comparing the data with the MC predictions, following the fit of MC to data.

FIG. 4: Distributions of the dijet invariant mass for the combined $e\nu q\bar{q}$ and $\mu\nu q\bar{q}$ channels comparing the data with the MC predictions for events in three regions of the RF output: (a) 0 < RF output < 0.33, (b) 0.33 < RF output < 0.66, and (c) 0.66 < RF output < 1.



